Graph- and finite element-based total variation models for the inverse problem in diffuse optical tomography
Total variation (TV) is a powerful regularization method that has been widely
applied in different imaging applications, but is difficult to apply to diffuse
optical tomography (DOT) image reconstruction (inverse problem) due to complex
and unstructured geometries, non-linearity of the data fitting and
regularization terms, and non-differentiability of the regularization term. We
develop several approaches to overcome these difficulties by: i) defining
discrete differential operators for unstructured geometries using both finite
element and graph representations; ii) developing an optimization algorithm
based on the alternating direction method of multipliers (ADMM) for the
non-differentiable and non-linear minimization problem; iii) investigating
isotropic and anisotropic variants of TV regularization, and comparing their
finite element- and graph-based implementations. These approaches are evaluated
in experiments on simulated data and on real data acquired from a tissue phantom.
Our results show that both FEM- and graph-based TV regularization are able to
accurately reconstruct both sparse and non-sparse distributions without the
over-smoothing effect of Tikhonov regularization or the over-sparsifying
effect of L1 regularization. The graph representation was found to
outperform the FEM method for low-resolution meshes, and the FEM method was
found to be more accurate for high-resolution meshes. Comment: 24 pages, 11 figures. Revised version includes revised figures and
improved clarity
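The graph representation in i) admits a compact sketch: differences across graph edges play the role of the image gradient, and the ADMM step for the non-differentiable TV term reduces to soft-thresholding. The nodes, edges, and values below are made-up toy data for illustration, not the paper's actual mesh construction.

```python
import numpy as np

def incidence_matrix(n_nodes, edges):
    """Edge-node incidence matrix D, so that (D @ x)[k] is the
    difference of x across edge k of the graph."""
    D = np.zeros((len(edges), n_nodes))
    for k, (i, j) in enumerate(edges):
        D[k, i], D[k, j] = 1.0, -1.0
    return D

def graph_tv(x, D):
    """Anisotropic graph TV: sum of absolute differences over edges."""
    return np.abs(D @ x).sum()

def shrink(v, t):
    """Soft-thresholding, the closed-form proximal map of the l1 norm;
    this handles the non-differentiable TV term inside each ADMM step."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Toy unstructured graph: 4 nodes, 4 edges
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
D = incidence_matrix(4, edges)
x = np.array([0.0, 0.0, 1.0, 1.0])
tv = graph_tv(x, D)   # two edges cross the jump, so TV = 2
```

On an unstructured FEM mesh, the same pattern applies with `D` assembled from per-element shape-function gradients instead of unit edge differences.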
Variational and PDE-based methods for image processing
In this thesis, we study modern variational and partial differential equation (PDE)-based methods for three image analysis applications, namely, image denoising, image segmentation, and surface reconstruction from point clouds. A common feature these applications have is the use of novel variational formulations.
For image denoising, we focus on higher order variational functionals in which the regulariser incorporates second order derivatives or is a sophisticated combination of first and second order derivatives. We study seven representative first and/or second order functionals, implement them using the efficient split Bregman algorithm, and compare their performances. With the knowledge of the main properties of each of the denoising approaches, we can then select and adapt them for image segmentation.
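The split Bregman algorithm mentioned above alternates a linear solve for the image with a shrinkage step for the auxiliary gradient variable. The following is a minimal 1D anisotropic TV denoising sketch with illustrative parameter values; the thesis applies the same splitting to higher-order 2D functionals.

```python
import numpy as np

def split_bregman_tv1d(f, mu=10.0, lam=1.0, n_iter=50):
    """Split Bregman for 1D anisotropic TV denoising:
        min_u  mu/2 * ||u - f||^2 + |D u|_1
    where D is the forward-difference operator."""
    n = len(f)
    D = np.eye(n, k=1) - np.eye(n)        # forward differences
    D[-1, :] = 0.0                        # no difference at the boundary
    A = mu * np.eye(n) + lam * D.T @ D    # u-subproblem normal matrix
    d = np.zeros(n)
    b = np.zeros(n)
    u = f.copy()
    for _ in range(n_iter):
        # u-step: quadratic problem, direct linear solve
        u = np.linalg.solve(A, mu * f + lam * D.T @ (d - b))
        # d-step: closed-form shrinkage (soft-thresholding)
        v = D @ u + b
        d = np.sign(v) * np.maximum(np.abs(v) - 1.0 / lam, 0.0)
        # Bregman variable update
        b = v - d
    return u

# Noisy step signal: TV denoising recovers the piecewise-constant shape
rng = np.random.default_rng(0)
f = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * rng.standard_normal(100)
u = split_bregman_tv1d(f)
```

In practice the u-step is solved with Gauss-Seidel iterations or FFTs rather than a dense solve; the dense solve here just keeps the sketch short.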
For image segmentation, we are in particular interested in images of three types: red blood cell (RBC) images, histology images of microglial cells, and optical coherence tomography (OCT) images of the retina. For RBC images we develop an automated and accurate image analysis framework for an image-based cytometer that uses total generalised variation, adaptive thresholding and a support vector machine. The framework can 1) detect and numerically count malaria-parasite-infected RBCs acquired from Giemsa-stained smears; 2) classify all parasitic subpopulations by quantifying the area occupied by the parasites within the infected cells; 3) predict whether an RBC image has been infected by malaria parasites. We show the effectiveness of the framework by quantifying and classifying both RBC and infected RBC images.
For histology images of the microglial cells, we introduce an automated image segmentation method that is capable of efficiently extracting microglial cells from the images. The method uses variational Mumford-Shah total variation and split Bregman for image denoising and segmentation and is fast, accurate and robust against noise and inhomogeneity in the image. We evaluate the method on the image data from wild type mice and transgenic mouse models of Alzheimer's disease. The method is scalable to large datasets, allowing microglia analysis in regions of interest and across the whole brain.
For OCT images of the retina, we propose a novel and accurate geodesic distance method to segment healthy and pathological OCT images, in both two and three dimensions. The method uses a geodesic distance weighted by an exponential function that takes into account horizontal and vertical intensity variations. The fast sweeping method is used to compute the geodesic distance from an Eikonal equation, a special case of the Hamilton-Jacobi equations, which belong to the family of nonlinear PDEs. Segmentation is then achieved by solving an ordinary differential equation using the resulting geodesic distance. The proposed method is also extensively compared with the parametric active contour model and graph-theoretic methods.
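The Eikonal solve at the heart of the geodesic method can be sketched with the fast sweeping method: Gauss-Seidel updates in alternating sweep directions. This is a minimal 2D sketch with unit speed; the intensity-dependent exponential weighting described above is left out.

```python
import numpy as np

def fast_sweeping_2d(speed, seeds, h=1.0, n_sweeps=4):
    """Solve the Eikonal equation |grad u| = 1/speed, with u = 0 at the
    seed points, by Gauss-Seidel sweeps in alternating directions."""
    ny, nx = speed.shape
    u = np.full((ny, nx), np.inf)
    for i, j in seeds:
        u[i, j] = 0.0
    orders = [(range(ny), range(nx)),
              (range(ny), range(nx - 1, -1, -1)),
              (range(ny - 1, -1, -1), range(nx)),
              (range(ny - 1, -1, -1), range(nx - 1, -1, -1))]
    for _ in range(n_sweeps):
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if u[i, j] == 0.0:
                        continue  # seed points stay fixed
                    a = min(u[i - 1, j] if i > 0 else np.inf,
                            u[i + 1, j] if i < ny - 1 else np.inf)
                    b = min(u[i, j - 1] if j > 0 else np.inf,
                            u[i, j + 1] if j < nx - 1 else np.inf)
                    if min(a, b) == np.inf:
                        continue  # no finite upwind neighbour yet
                    f = h / speed[i, j]
                    if abs(a - b) >= f:   # causal information from one axis only
                        cand = min(a, b) + f
                    else:                  # two-sided quadratic update
                        cand = 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))
                    u[i, j] = min(u[i, j], cand)
    return u

# Unit speed: u approximates the Euclidean distance from the seed
u = fast_sweeping_2d(np.ones((21, 21)), seeds=[(10, 10)])
```

Along grid axes the computed distance is exact; diagonally it overestimates slightly, as expected for this first-order upwind discretisation.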
Finally, we study surface reconstruction from point clouds. We treat this reconstruction problem as an image segmentation problem and hence develop a novel variational level set method. The method is capable of reconstructing implicit surfaces from unorganised point clouds while preserving fine details of the surfaces. A distance function, derived from the point cloud using the fast sweeping algorithm, is used as an edge indicator function and to find an initial image enclosed by the point cloud. A novel variational segmentation functional is then proposed that effectively integrates the initial image and the edge indicator. Gradient descent optimisation finally minimises the functional and ensures an accurate and smooth reconstruction.
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Existing audio-visual event localization (AVE) methods handle manually trimmed
videos, each containing only a single event instance. However, this setting is
unrealistic as natural videos often contain numerous audio-visual events with
different categories. To better adapt to real-life applications, in this paper
we focus on the task of dense-localizing audio-visual events, which aims to
jointly localize and recognize all audio-visual events occurring in an
untrimmed video. The problem is challenging as it requires fine-grained
audio-visual scene and context understanding. To tackle this problem, we
introduce the first Untrimmed Audio-Visual (UnAV-100) dataset, which contains
10K untrimmed videos with over 30K audio-visual events. Each video has 2.8
audio-visual events on average, and the events are usually related to each
other and might co-occur as in real-life scenes. Next, we formulate the task
using a new learning-based framework, which is capable of fully integrating
audio and visual modalities to localize audio-visual events with various
lengths and capture dependencies between them in a single pass. Extensive
experiments demonstrate the effectiveness of our method as well as the
significance of multi-scale cross-modal perception and dependency modeling for
this task. Comment: Accepted by CVPR202
Introducing anisotropic tensor to high order variational model for image restoration
Second order total variation (SOTV) models have advantages for image restoration over their first order counterparts, including their ability to remove the staircase artefact in the restored image. However, such models tend to blur the reconstructed image when discretised for numerical solution [1–5]. To overcome this drawback, we introduce a new tensor-weighted second order (TWSO) model for image restoration. Specifically, we develop a novel regulariser for the SOTV model that uses the Frobenius norm of the product of the isotropic SOTV Hessian matrix and an anisotropic tensor. We then adapt the alternating direction method of multipliers (ADMM) to solve the proposed model by breaking down the original problem into several subproblems, all of which have closed-form solutions and can be solved efficiently. The proposed method is compared with state-of-the-art approaches such as tensor-based anisotropic diffusion, total generalised variation, and Euler's elastica. We validate the proposed TWSO model with extensive experiments on a large number of images from the Berkeley BSDS500 dataset, and demonstrate that our method effectively reduces both the staircase and blurring effects and outperforms existing approaches for image inpainting and denoising applications.
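The closed-form ADMM subproblems mentioned above typically come down to proximal maps of matrix norms. As a generic illustration (not the paper's exact TWSO subproblem), the proximal map of the Frobenius norm is block soft-thresholding:

```python
import numpy as np

def prox_frobenius(V, t):
    """Closed-form proximal map of t * ||V||_F (block soft-thresholding):
    shrink the whole matrix towards zero by t in Frobenius norm, and
    return the zero matrix if ||V||_F <= t."""
    norm = np.linalg.norm(V)          # Frobenius norm by default
    if norm <= t:
        return np.zeros_like(V)
    return (1.0 - t / norm) * V

V = np.array([[3.0, 0.0],
              [0.0, 4.0]])            # ||V||_F = 5
W = prox_frobenius(V, 1.0)            # scales V by (1 - 1/5) = 4/5
```

In an ADMM splitting, an operator of this kind is applied per pixel to the Hessian-valued auxiliary variable, which is what makes each subproblem cheap.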
Σ-net: Ensembled Iterative Deep Neural Networks for Accelerated Parallel MR Image Reconstruction
We explore an ensembled Σ-net for fast parallel MR imaging, including
parallel coil networks, which perform implicit coil weighting, and sensitivity
networks, involving explicit sensitivity maps. The networks in Σ-net are
trained in a supervised way, including content and GAN losses, and with various
ways of data consistency, i.e., proximal mappings, gradient descent and
variable splitting. A semi-supervised fine-tuning scheme allows us to adapt to
the k-space data at test time; this, however, decreases the quantitative
metrics, although it produces the most textured and sharpest-looking images. For
this challenge, we focused on robust and high SSIM scores, which we achieved by
ensembling all models into a Σ-net. Comment: fastMRI challenge submission (team: holykspace)
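The proximal-mapping flavour of data consistency mentioned above has a particularly simple noiseless (hard) limit: transform the current estimate to k-space, overwrite the sampled locations with the measured data, and transform back. This sketch is single-coil; the ensembled networks additionally handle explicit or implicit coil sensitivities.

```python
import numpy as np

def data_consistency(x, y, mask):
    """Hard data-consistency step: keep the measured k-space samples y
    wherever mask is True, keep the network's k-space prediction
    elsewhere, then return to image space."""
    k = np.fft.fft2(x)
    k = np.where(mask, y, k)      # enforce measurements at sampled locations
    return np.fft.ifft2(k)

# Toy check: with a fully sampled mask, the measurements are reproduced
# exactly regardless of the input image estimate.
x_true = np.random.default_rng(1).standard_normal((8, 8))
y = np.fft.fft2(x_true)
mask = np.ones((8, 8), dtype=bool)
x = data_consistency(np.zeros((8, 8)), y, mask)
```

With noisy data, a soft version interpolates between the prediction and the measurement instead of overwriting, which is the actual proximal mapping of the data-fidelity term.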
Fourier-Net+: Leveraging Band-Limited Representation for Efficient 3D Medical Image Registration
U-Net style networks are commonly utilized in unsupervised image registration
to predict dense displacement fields, which for high-resolution volumetric
image data is a resource-intensive and time-consuming task. To tackle this
challenge, we first propose Fourier-Net, which replaces the costly U-Net style
expansive path with a parameter-free model-driven decoder. Instead of directly
predicting a full-resolution displacement field, our Fourier-Net learns a
low-dimensional representation of the displacement field in the band-limited
Fourier domain which our model-driven decoder converts to a full-resolution
displacement field in the spatial domain. Expanding upon Fourier-Net, we then
introduce Fourier-Net+, which additionally takes the band-limited spatial
representation of the images as input and further reduces the number of
convolutional layers in the U-Net style network's contracting path. Finally, to
enhance the registration performance, we propose a cascaded version of
Fourier-Net+. We evaluate our proposed methods on three datasets, on which our
proposed Fourier-Net and its variants achieve comparable results with current
state-of-the-art methods, while exhibiting faster inference speeds, lower
memory footprint, and fewer multiply-add operations. With such small
computational cost, our Fourier-Net+ enables the efficient training of
large-scale 3D registration on low-VRAM GPUs. Our code is publicly available at
\url{https://github.com/xi-jia/Fourier-Net}. Comment: Under review. arXiv admin note: text overlap with arXiv:2211.1634
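The parameter-free model-driven decoder can be sketched as zero-padding a small band-limited block of Fourier coefficients into a full-resolution spectrum and applying the inverse FFT. The centred coefficient layout and the amplitude rescaling below are illustrative assumptions, not Fourier-Net's exact convention.

```python
import numpy as np

def model_driven_decoder(low_freq, full_shape):
    """Decode a band-limited (low-resolution, centred) Fourier
    representation to a full-resolution spatial field via zero padding
    and an inverse FFT. No learnable parameters are involved."""
    k_full = np.zeros(full_shape, dtype=complex)
    h, w = low_freq.shape
    H, W = full_shape
    top, left = (H - h) // 2, (W - w) // 2
    k_full[top:top + h, left:left + w] = low_freq  # centre the band
    # shift so low frequencies sit where ifft2 expects them
    field = np.fft.ifft2(np.fft.ifftshift(k_full)).real
    # rescale so amplitudes are independent of the padding ratio
    return field * (H * W) / (h * w)

# A constant low-resolution field decodes to a constant full-resolution one
low = np.fft.fftshift(np.fft.fft2(np.ones((4, 4))))
disp = model_driven_decoder(low, (16, 16))
```

Because the decoder only pads and transforms, the network before it needs to predict far fewer coefficients than a full-resolution displacement field, which is where the memory and compute savings come from.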